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HOMOTOPIES: 

A  panacea  or  just  another  method  ? 

Isidore  Rigoutsos 
Computer  Science  Department, 

University  of  Rochester, 

Rochester,  NY  14627. 

Abstract 

A  general  framework  is  presented  for  the  solving  of  non-linear  equations.  Also,  a 
discussion  about  its  potential  applications  in  the  field  of  Computer  Vision  is  made 
and  illustrated  by  an  example  that  shows  how  one  can  relate  the  solutions  to  the 
Shape  from  Shading  problem  through  Scale  Space.  The  methods  presented  seem  to 
have  been  known  since  the  19th  century.  However,  it  was  not  until  1953  that  the 
first  practical  applications  of  the  relevant  idea  appeared. 

Index  terms:  Hilbert  space,  Homotopies,  Path  following,  Scale  Space,  Shape  from 
Shading. 

1.  Introduction 

In  a  great  many  applications,  one  faces  the  problem  of  solving  one  or  a  system  of 
non-linear  equations.  They  could  be  non-linear  equations  determining  the  amount  of 
life  insurance  that  one  should  have  ([16]),  the  pressure  changes  inside  a  gap  between 
two  objects  moving  very  fast  past  to  each  other  ([16]),  the  parameters  involved  in  a 
calibration  problem  ([20]),  the  surface  normal  at  a  point  (x,  y)  of  an  object  in  the 
shape  from  shading  problem  ([19])  etc. 

The  diversity  of  the  above  examples  shows  why  solving  non-linear  equations  is 
an  important  issue  in  current  scientific  applications,  and  explains  the  abundance  of 
the  numerical  methods  currently  available  for  this  purpose. 

Most  methods  for  solving  one  equation  readily  extend  to  methods  that  solve  a 
system  of  linear  equations.  In  general,  these  methods  are  characterized  by  (at  least) 
two  parts:  an  "iteration”  part  and  a  "convergence  test”  part.  In  what  immediately 
follows,  we  are  concerned  about  the  first  of  these  two  parts. 

Consider  the  equation  fix)  =  0.  The  basic  parameters  of  an  iteration  that  solves  the 
former,  are: 

-  a  low-degree  polynomial  model  for  fix) 

-  an  assumption  about  the  behavior  of  fix) 


The  most  broadly  used  methods  are: 

a)  the  bisection  method :  it  assumes  no  model  for  fix)  and  tries  to  reduce  the  radius 
of  the  neighborhood  around  the  solution  of  fix)  =  0  to  the  value  0,  hence  finding 
the  solution  x*  to  the  latter  equation  (Fig  l.a). 

b)  Newton’s  method :  it  assumes  a  linear  model  for  fix)  and  uses  the  slope  of  the 
line  tangent  at  the  point  x  of  fix)  (Fig.  l.b). 

c)  the  secant  method:  it  is  a  variation  of  Newton’s  method  that  avoids  computing 
the  derivative  of  fix);  it  also  assumes  a  linear  model  based  on  the  two  most  recent 
values  of  fix)  (Fig.  l.c). 

d)  the  reeula  falsi  method :  it  applies  in  the  same  context  as  the  bisection  method, 
except  that  now  a  straight  line  model  is  assumed  (Figl.d). 

e)  the  fixed-point  method :  one  reformulates  fix)  =  0  as  fix)  +  x  =  x  and  finds  the 
point  of  intersection  of  gj(x)  =  fix)  +  x  and  g2(x)  =  x,  i.e.  the  fixed  point  of  gt(x) 
(Fig.  I.e). 

f)  the  Muller  method:  extension  of  the  secant  method  that  assumes  a  quadratic 
model  (Fig.  l.f). 


2.  Motivation  and  Previous  work 

The  motivation  for  this  study  of  the  use  of  homotopies  was  a  paper  entitled 
"Signal  Matching  through  Scale  Space ”([15]).  This  paper  deals  with  the  solution  to 
the  Signal  Matching  Problem  through  Scale  Space  and  how  it  relates  to  the  solution 
of  some  (relevant)  known  Computer  Vision  problems.  The  presentation  was  rather 
clear,  however,  the  paper  also  contained  a  not-so-brief  exposition  of  the  authors’ 
knowledge  about  solving  problems  of  that  kind.  As  a  matter  of  fact,  after  having 
formulated  the  original  problem  as  an  optimization  problem,  they  proceeded  by 
suggesting  the  "stabilization  methods”  ([12])  as  a  candidate  for  the  solving  of  this 
problem.  This  approach  generated  a  non-linear  system  of  equations;  they  then 
proposed  the  use  of  "continuation  methods”  ([11])  to  solve  this  non-linear  problem. 
However,  they  rejected  this  latter  alternative  and  (apparently)  used  their  intuition 
to  provide  us  with  a  "...  a  more  attractive  alternative  ...”,  that  of  gradient  descent 
through  scale  space,  which  generates  a  set  of  equations  "describing”  the  problem. 
However,  once  more  time  they  rejected  the  equations  that  resulted  as  involving 
”... high-dimensional  spaces  and  suggested  a  new  (approximately)  equivalent  to 

the  former,  new  system  of  equations,  which  they  finally  solved. 

Although  intuition  is  always  attractive,  it  would  clearly  be  useful  to  provide  a 
general  framework  in  which  the  Signal  Matching  problem  as  well  as  all  other 
optimization  problems  could  be  solved.  Such  a  framework  is  presented  in  what 
follows.  We  start  with  a  discussion  about  the  use  of  problem  imbedding  during  the 
last  few  years. 

For  about  a  century,  scientists  have  known  about  imbedding  methods.  However, 
the  homotopies  did  not  specify  how  one  can  accurately  calculate  the  solution  of  an 
equation  or  system  of  equations,  or  the  fixed  point  of  a  function.  Using  them  it  was 
only  possible  to  verify  whether  it  is  theoretically  possible  to  reach  a  desired  point. 
The  first  major  step  in  the  20th  century  was  taken  by  Davidenko  ([18]).  Davidenko 
was  the  first  one  that  built  on  the  homotopy  idea  by  adding  path  following  aspects  to 
it.  He  provided  a  set  of  differential  equations  that  "ruled”  the  path  leading  from  an 
initial  position  to  the  desired  point;  these  differential  equations  were  applied  to  a 
great  many  deal  of  problems  ranging  from  determinant  evaluations  and  polynomial 
root  finding  to  eigenvalue  and  boundary  value  problems  ([13],  [18],  [9]).  In  general, 
his  ideas  led  to  algorithms  that  produced  differentiable  paths. 

(Note:  By  "path”  we  mean  a  piecewise  differentiable  curve  in  n-dimensions ) 


Another  more  recent  approach  to  path  following  was  initiated  by  Scarf  ([2]);  Scarf 
did  not  use  the  homotopy  approach  but  a  new  method  based  upon  the  notion  of  the  so 
called  "primitive  sets”.  This  approach  led  to  algorithms  generating  piecewise 
linear  paths.  The  underlying  principle  of  both  approaches  is  the  same: 

"  ...  starting  from  here,  follow  a 
path  that  leads  there...  ”. 

It  has  to  be  pointed  out  that  there  is  no  best  path  following  approach.  The 
approach  that  has  to  be  taken  by  somebody,  is  closely  related  to  the  problem  under 
consideration. 

The  layout  of  this  paper  is  the  following: 

Section  3  is  a  tutorial  on  homotopies.  Section  4  discusses  computational 
considerations  of  the  methods  that  solve  systems  of  differential  equations.  Section  5 
contains  a  discussion  on  how  to  get  by  local  extrema,  and  finally  Section  6 
illustrates  the  methods  presented  in  Section  3,  with  an  example  showing  how  one 
can  relate  the  solutions  to  the  Shape  from  Shading  problem  through  Scale  Space,  and 
also  mentions  other  potential  applications  of  the  homotopy  methods  in  the  field  of 
Computer  Vision. 

3.  HOMOTOPIES  ( imbeddings ) 

3.1  A  description 

Let  E  be  a  real  Hilbert  space,  and  let  F:  E  -*  E  be  a  nonlinear  operator. 

Suppose  that  we  are  concerned  with  the  numerical  solution  of  the  equation  (actually 
the  system  of  equations  ): 

F(x)  =  0  (1) 

Suppose  also  that  x*  is  a  solution  of  (1).  A  well-known  method  for  approximating  x* 
is  Newton’s  method.  It  consists  of  the  calculation  of  a  sequence  of  approximations  { 
xk  },  where 

xk+i  =xk-[F'(xk)]1  F(xk),k  =  0,l,2,3...(2) 

In  (2),  xO  is  a  given  approximation  to  x*.  The  "  '  ”  denotes  the  locally  Lipschitz 
continuous,  Frechet  derivative  of  F  at  x. 


f 

I 


Definition  1: 

A  mapping  F:  D  CE  -*  E  is  Lipschitz  continuous  in  Do  C  D  if  there  exists  a  constant  c 
such  that: 

Vx,y(  D:  ||  F(y)  -  F(x)  |  s  c  |  y  -  x  | 


Definition  2: 

Consider  the  mapping  H:  E  -*  E.  H  is  said  to  be  Frechet  differentiable  at  x  € 
interior(E)  if  there  is  a  linear  operator  A:  E  -*  D  C  E  such  that 

limit  H H(x  +  h)  -  H(x)  -  A(h)||  _  Q 

h  0  i  h  II 

The  linear  operator  A  is  denoted  by  F'(x)  and  is  called  the  Frechet  derivative  of  H  at 
x. 


Note:  This  last  condition  is  a  uniformity  condition  which  allows  most  of  the  usual 
properties  of  derivatives  in  one  dimension  to  be  carried  over  to  n  dimensions.  For 
example,  if  H:E  -►  E  is  Frechet-differentiable  at  x,  then  H  is  continuous  at  x  ([4]). 

Unfortunately,  Newton’s  method  suffers  from  a  problem;  in  particular,  when  x°  is 
remote  from  x*  then  the  sequence  {xk}  defined  by  (2)  will  generally  not  converge  to 
x*.  In  such  cases,  homotopies  (imbeddings)  appear  to  be  a  useful  tool  for  generating  a 
sequence  {xk}  that  converges  to  x*. 

Definition  3: 

A  homotopy  is  a  mapping  H:  E  X  [t°,  t1]  ■*  E,  for  which  H(x,t)  depends 
continuously  on  t,  such  that: 

H(x,  tO)  =  G(x)  (3a) 

H(x,U)  =  F(x)  (3b) 


with  H(x,  tO)  *  G(x)=  0  being  an  easily  solvable  system  and  G:  E  -*  E  an  (in 
general)  non-linear  operator.  [t°,  t1]  is  a  closed  interval  in  R. 


F  is  imbedded  (mapped)  in  the  family  of  operators, 

{  H(-,t)  /t€[tO,tl]  }  (4) 

Now  instead  of  the  single  problem  that  we  initially  had,  we  have  the  whole  family 
of  problems: 

H(x,  t)  —  0  /t€[tO,ti]  (5) 

Let  uO  £  E  be  the  solution  of  the  equation  H(x,  tO)  =  0.  Suppose  that  (5)  has  a 
unique  solution  x  =  U(t)  depending  continuously  on  t.  We  therefore  have 

H(U(t),t)  =  0  /t«[tO,tl]  (6) 

and 

U(tO)  =  u0,  U(ti)  =  x* 

Thus  U  defines  a  curve  ( path  )  in  E  with  starting  point  uO  and  ending  point  equal 
to  the  solution  x*  of  (1). 

3.2  An  example 

Let  us  now  illustrate  the  homotopy  method  with  an  example.  Consider  the 
following  nonlinear  system 

2xj3  +  12x,2  +  8xt  +  9x2  +  36  =  0 
2xj2  +  3x2  +  4  =0 

Assume  now  that  our  initial  system  of  equations  is  the  following 

2x,3  +  8Xj  +  9x2  =  0 
3x2  =  0 

The  unique  (  real )  solution  of  this  system  is  (xp  x2)  =  (0, 0). 

Now  consider  the  real  parameter  t  £  [to,  t1]  =  [0,1],  and  create  a  new  system  of 
equations  such  that  this  new  system,  at  t  =  1,  becomes  "equal”  to  the  one  we 
initially  had.  This  new  system  is: 

2Xj3+  8Xj  +  9x2  +  t(  12Xj2  +  36)  =  0 
3x2  +  t(2xj2  +  4)  =  0 

Denoting  the  solution  by  (Xj(t),  x^t)),  we  can  see  that  (Xj(0),  x2(0))  =  (0,0)  whereas 
(Xj(l),  x2(l))  is  the  "point”  we  wish  to  find.  Elimination  of  Xj  from  the  above  system 
results  in 

(x,  +  3t)(2x,2  +  8  )  =  0  =* 
x^t)  =  -3t 

x2(t)  =  -t(18t2  +  4)/ 3 

A  graph  (Fig.  2)  of  the  solution(s)  of  the  parameterized  system,  follows. 

At  t=l,  Xj  =  -3  and  x2  =  -22/3,  which  are  the  non-zero  solutions  of  the  initial 
system.  Although  we  could  equally  well  solve  the  above  non-linear  system  of 


equations  without  introducing  the  parameter  t,  the  utility  of  the  approach  extends 
to  numerical  and  not  just  analytic  methods. 

The  simplicity  of  the  previous  example  may  make  one  think  that  "inspection” 
suffices  and  path  following  is  superfluous.  However,  in  n-dimensional  spaces,  where 
visual  interpretations  can  hardly  exist,  we  are  forced  to  "inch”  along  the  path. 

Actually  the  exact  path  that  occurs  depends  directly  upon  the  selected  homotopy 
function  H(x,t). 

We  have  specified  the  types  of  G,  H  in  (3)  but  we  have  not  elaborated  upon  the 
form  of  H(x,t).  Thre  are  three  common  homotopies: 

1)  Newton  homotopy:  the  form  of  this  homotopy  is: 

H(x,t)  =  F(x)  -  (t*  -t)F(xO),  with  t°  =  0  and  t1  =  1 
named  after  Sir  Isaac  Newton.  Observe  that  G(x)  =  F(x)  -  F(xO)  which  has  the 
obvious  solution  x  =  xO. 

2)  Fixed  point  homotopy:  the  form  of  this  homotopy  is: 

H(x,t)  =  (t1  -  t)(x  -  xO)  +  tF(x),  with  t°  =  0  and  t1  =  1 


A  common  characteristic  of  the  first  two  homotopies  is  that  they  can  be  started  at 
any  point  xO. 


3)  Linear  homotopy:  the  form  of  this  third  homotopy  is: 

H(x,t)  =  G(x)  +  t[F(x)  -  G(x)],  with  t°  =  0  and  t1  =  1 
i.e.  this  homotopy  is  a  linear  combination  of  G(x)  and  F(x).  One  can  easily  observe 
that  the  linear  homotopy  subsumes  both  the  Newton  (in  which  case  G(x)  =  x  -  xO  ) 
and  the  Fixed  Point  (in  which  case  G(x)  =  F(x)  -  F(xO)  )  homotopies;  it  is  used 
whenever  one  wants  the  starting  function  G(x)  to  have  some  special  properties  ( as  in 
the  previous  example). 

3.3  Types  of  imbeddings 

In  the  sequel  we  shall  describe  some  ways  in  which  the  homotopy  idea  can  be  used 
in  a  numerical  approximation  of  x*,  the  latter  being  the  solution  of  (1). 

1)  Discrete  Imbedding:  this  consists  of  successive  numerical  approximations  of  the 
solutions  of 

=  0,  for  i  =  0, 1, 2, 3, ...  N 

N  is  an  integer  and  the  set  {t^}  is  a  partition  of  [t°,  t*].  As  a  starting  point  for 
approximating  the  solution  at  t  =tm  the  solution  at  t  =  tm4  is  used.  This  method  is 
also  known  under  the  name  continuation  method  ([11],  [15]).  In  this  approach,  if 
the  distance  i  -  t^i  is  sufficiently  small,  then  a  sequence  of  solutions  that 
converges  to  x*  can  possibly  be  found. 

2)  Transformation  to  an  initial  value  problem:  this  is  a  totally  different  approach 
to  approximate  x*.  Assume  that  the  mapping  U(t)  is  continuously  differentiable  on 
[tO.tl]  and  that  H  has  locally  Lipschitz  continuous  partial  Frechet  derivatives;  then 
we  obtain  by  differentiation  of  (6)  with  respect  to  t,  a  system  of  differential  equations 
which  together  with  the  initial  condition  U(t°)  =  u0  form  an  initial  value  problem. 
The  numerical  solution  of  the  latter  can  subsequently  be  obtained  by  application  of  a 
standard  numerical  integration  procedure.  The  resulting  solution  can  then  be  used 
as  the  starting  point  of  an  iterative  method  that  approximates  x*  more  closely.  This 
approach  is  attributed  to  Davidenko  ([13],  [18])  and  is  explained  in  the  sequel  in 
more  detail. 
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3)  Iterative  imbedding:  this  is  the  least  frequently  used  of  all  imbeddings.  In  this 
case  one  elaborates  on  the  initial  value  problem  idea.  In  particular,  the  homotopy  is 
supposed  to  be  a  linear  one  (see  above);  then  differentiation  with  respect  to  t  leads  to 
an  initial  value  problem  of  the  form: 

dx/dt  =  -[(ti-t)*0,G(x(t),xO)  +  t*F'(x(t))]1*[-G(x(t),xO)  +  F(x(t))] 

with  tl  =  1  and  t°  =  0  and  x(tO)  =  x0([3]).  0t  is  the  partial  Frechet  derivative  of  H 

with  respect  to  t.  The  above  system  is  solved  by  the  following  iterative  process: 

xk  +  l  =  V(xk), 

where  x1  s  V(xO)  is  the  solution  of  that  system  at  t=l,  by  means  of  a  numerical 
integration  procedure.  Then,  xO  is  replaced  by  xl,  which  in  turn  is  replaced  by  the 
new  derived  solution  x2,  and  so  on... 

Up  to  this  point  we  have  ignored  the  following  extremely  important  questions: 
a)existence,  bkontinuity,  and  c)uniquenes  of  the  mentioned  path. 

3.4  Path  Existence 

Let  us  give  some  definitions  first.  Define 

H  1  ={(x,t)/H(x,t)  =  0}  (7) 

to  be  the  set  of  all  solutions  (x,t)  €  EX[tO,t*]of  the  initial  system  of  equations 

H(x,t)  =0 

In  general,  H  1  can  be  rather  unconstrained.  The  points  (x,t)  that  satisfy  H(x,t) 
=  0  could  be  all  over  the  place,  experiencing  no  particular  configuration.  Let  also 

H  l(t)  ={x/H(x)  =  0}  (8) 

i.e.  H  *(t)  is  the  set  of  the  solutions  for  a  fixed  value  t.  One  can  immediately  observe 
that 

H  l(t0)  ={(X0)/H(x,t0)  =  G(x)  =  0} 
and  H'(tl)  ={x*  =  x(ti)/H(x,tU  =  F(x)  =  0} 

Therefore,  we  can  see  that  the  task  of  determining  whether  there  are  any  paths  of 
solutions  becomes  equivalent  to  determining  whether  H  1  is  non-empty.  One  should 
keep  in  mind  that  there  may  be  cases  where  H  1  consists  of  curves  that  may  not  be 
paths.We  now  proceed  towards  their  elimination  in  order  to  ensure  that  H  1  contains 
only  paths. 

Recalling  the  definition  of  a  Jacobian,  we  can  write: 

H'  =  (  H'x,  6H  /  9t ),  (9) 

where 
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H't  =  (  GHi  /0t  6H2/et  0H3/6t  . 0Hm/et)t  (11) 

At  this  point,  we  need  to  recall  the  well-known  Implicit  Function  Theorem  from 
the  theory  of  Functional  Analysis  ([1]) 

Theorem :  Let  H:  E  -  E  be  defined  and  continuously  differentiable  in  a 
neighbourhood  ExT  of  a  point  (xi,  X2,  X3,  ...  xm,  t)  *  (xp,  tp)  of  ExftO.t1]  such  that 
H(xp,  tp)  =0  and  the  Jacobian  H'x  is  invertible  at  (xp,t).  Then  there  is  an  open 
neighborhood  Wo  c  E  of  (xi,  X2,  X3,  ...  xm)  such  that  for  any  connected  open 
neighborhood  W  c  Wo  of  (xi,  X2,  X3, ...  xm)  there  is  a  unique  function  g: 

g:  E  X  [tO.tl]  with  g(xp)  =  tp 

defined,  continuous,  and  continuously  differentiable  in  W  such  that 

H(x,  g(x))  =0  V  x  i  W 
For  a  proof  of  this  theorem  see  [1],  p.  265-268. 

This  theorem  immediately  excludes  forks,  crossings,  splittings  or  other  anomalies 
that  may  occur.  We  are  therefore  guaranteed  that  our  path  is  extremely  well 
behaved.  One  can  proceed  further  and  obtain  the  following  corollary  ([2]): 

Corollary:  Let  H:  E  -  E  be  defined  and  continuously  differentiable  in  a 
neighbourhood  ExT  of  a  point  (xi,  X2,  X3, ...  xm,  t)  ■  (xp,  tp)  of  Ex[to,U]  such  that 
H(xp,  tp)  =0.  Suppose  now  that  for  all  such  points  p  of  Ex[to,t*]  with  p  (  H  ‘,  the 
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Jacobian  H'x  is  invertible  at  p  =  (xp,tp).  Then  H  1  consists  only  of  continuous  and 
continuously  differentiable  paths. 

Hence  the  path  existence  and  continuity  can  be  ensured  given  the  validity  of  the 
above  theorem/corollary  hypotheses.  As  for  the  uniqueness  one  can  easily  see  that  if 
the  equation  (1)  has  a  unique  solution,  it  follows  from  the  implicit  function  theorem 
that  the  cardinality  of  H  1  equals  1.  A  more  elaborate  discussion  on  existence  issues 
can  also  be  found  in  [6]. 

3.5  Moving  along  paths 

So  far  we  dealt  with  results  refering  to  the  existence,  uniqueness  and  continuity 
of  the  paths.  But  we  have  not  discussed  at  all  how  one  can  find  these  paths. 

3.5.1  Davidenko’s  Approach 

Davidenko  ([13],  [18])  was  the  first  one  that  introduced  the  differential  equation 
as  a  means  for  solving  H(x,t)  =  0  with  t  €  [tO,tl].  His  approach  consisted  of  the 
following:  he  differentiated  (1)  with  respect  to  parameter  t  and  obtained  the  system 
of  equations: 
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Obviously,  this  system  of  equations  requires  that  H'x  is  non-singular.  Furthermore, 
these  equations  used  t  as  a  parameter([4]).  The  use  of  t  as  a  parameter  is  not  a  good 
idea,  since  it  creates  the  following  problem:  "at  a  branching  point,  the  matrix  H'x 
becomes  singular  and  any  integration  method  obviously  fails”  (see  also  the 
discussion  about  local  extrema,  in  Section  5).  One  may  want  to  adapt  their  method  in 
such  a  way  so  that  it  is  able  to  handle  branching  points  (and  local  extrema  with 
appropriate  extensions).  The  first  such  adaptation  was  by  Klopfenstein  ([5])  who 
introduced  a  natural  parameterization  using  the  arc  length  of  the  solution  locus;  the 
latter  parameterization  is  apparently  a  powerful  technique  and  does  not  suffer  from 
the  above  mentioned  problems. 


3.5.2  The  Basic  differential  equations 

Let  us  now  use  the  parameter  's’  to  denote  the  distance  moved  along  the  path  ( 
length  of  the  arc ).  Let  therefore 

Y(s)»  ( yi(s),y2(s),y3(s), ...  ym(s),  ym+1(s) )  =  (x(s),t(s))  (12) 

where  we  denote  xj  by  yi,  V  i  €{1,  2,  3, ...  m  },  and  t  by  ym  +  1 ,  for  clarity.  Observe  that 
the  point  Y(s)  tells  us  where  we  are,  after  having  travelled  a  distance  s  along  the 
path.  Since  as  s  varies  Y  =  Y(s)  describes  a  path  in  H1,  we  conclude  that 

H(Y(s))  =  0  (13) 

Differentiation  of  the  latter  with  respect  to  s,  and  application  of  the  chain  rule,  gives 


1  e  h  dyi(s) 

V - =  o  (14) 

,Ti  6>'i(s)  ds 

where  0H  /  0yi(s)  is  the  i-th  column  of  the  Jacobian  H'.  Usually,  H  is  an  m- 
dimensional  vector,  H  =  (Hi,  H2,  ...,  Hm),  in  which  case  this  latter  equation  is  a 
system  of  m  linear  quations  in  the  m+1  unknowns  dyVds,  i  =  1,2,3,  ...  m  +  1.  The 
(m  +  l)th  equation  is  the  following: 
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which  results  from  the  very  definition  of  the  length  of  the  arc  ([17]). 
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In  what  follows,  we  adopt  the  following  notation: 

Notation:  By  H  (  Ck  we  mean  that  H  has  continous  derivatives  up  to  thek-th  order. 


Definition  4:  An  operator  H  is  called  regular  if  the  Jacobian  matrix  H '( x,t)  has  a 
rank  equal  to  m(  i.e.  equal  to  the  size  of  the  shortest  side  of  the  Jacobian  matrix  ). 


The  Basic  Differential  Equations  theorem :  Let  H:  EXE,  H  €  C2,  and  H  be  regular. 
Given  a  starting  point  yO  in  H1,  the  solution  of  the  "basic  differential  equations” 
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starting  from  y(s)  =  y°  is  unique  and  determines  a  path  in  H  1  ([2]). 


( Note:  by  *H'(y)  we  denote  the  m  x  m  matrix  that  results  if  we  delete  the  i-th  column 
of  the  Jacobian  matrix  H'(y). ) 

The  Basic  Differential  Equations  provide  a  general  means  to  follow  a  path  in  H  1 
and  therefore  obtain  a  solution  point  x*  €  H  Ht1).  However,  the  Basic  Differential 
Equations  approach  is  far  more  than  merely  a  means  to  solve  a  system  of  differential 
equations.  They  also  provide  immediate  and  important  information  concerning  the 
direction  ( =  orientation  )  of  the  path. 

3.6  Which  wav  are  we  travelling  -  coming  or  going  ? 

While  dealing  the  system  of  the  Basic  Differential  Equations  it  is  important  to 
consider  certain  aspects  of  the  problem.  One  may  naively  think  that  with  increasing 
s,  t  increases  as  well.  Unfortunately,  this  is  not  the  case  as  it  can  be  easily  seen  from 
the  following  figure  (Fig.  3)  reproduced  from  [10].  Suppose  that  we  are  at  t  =  t°  and 
desire  to  increase  t.  If  t  >  0  then  increasing  s  will  increase  t.  Respectively,  if  t  <  0 
the  opposite  is  true.  Hence,  we  must  first  check  the  sign  of  t.  Similar  phenomena 
would  occur  if  we  moved  from  some  position  (x2,t2)  to  a  new  one  (x3,t3)  along  the 
path;  which  way  to  change  s  is  not  clear  beforehand. 

3.7  Using  which  approach  should  one  try  to  "move"  ? 

The  system  of  equations  that  results  from  Davidenko’s  approach  can  be  solved 
whenever  H'x  has  a  rank  equal  to  m  (see  above).  On  the  other  hand,  if  H  is  regular 
we  may  write  y  =  (x,t)  as  a  function  of  s  and  obtain  the  set  of  Basic  Differential 
Equations. 

The  assumption  of  the  "regularity  of  H  V’  is  a  rather  mild  one  and  holds  almost 
always.  No  matter  which  system  we  decide  to  solve,  both  systems  must  (and  will) 
generate  the  same  path  since  they  were  derived  from  the  same  homotopy. 

4.  Computational  considerations 

Davidenko’s  approach  can  be  used  whenever  there  is  a  problem  during  the 
evaluation  of  the  determinants  involved  in  the  Basic  Differential  Equation 
approach.  However,  one  can  also  face  some  problems  using  the  former  method  since 
the  evaluation  of  the  inverse  Jacobian  H'x  may  involve  too  much  computational 
effort.  The  usual  tricks  that  are  used  in  this  case  are: 

1)  use  difference  formulas  instead  of  calculating  the  derivatives  of  H'x. 

2)  do  not  evaluate  the  Jacobian  at  every  step  but  rather  every  few  steps. 


4.1  Choosing  the  correct  numerical  method 

Recall  the  Euler  method  formula  for  the  general  initial  value  problem: 

dy  w 

-  =  f(x,y),  where  y  £  R  ,x  €  R  and  y(0)  is  known 

dx 

Euler’s  formula  is 

yk  +  l  =  yk  +  f(xk,yk)  ( yk  + 1  -yk),  where  0<yl  <>-2<y3<  .  .  <  yk  <  yk+1  A  y(0)  =  y° 

Euler’s  method  is  a  simple  method  based  on  the  linear  model  and  has  a  characteristic 
low  accuracy.  In  practice  the  round-off  error  becomes  the  limiting  factor  of  the 
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method;  in  fact  Euler’s  method  tends  to  drift  further  and  further  from  the  exact  curve 
(Fig.  4). 


Euler’s  method  is  characterized  by  a  one  step  computation;  in  fact  it  goes  from  xk  to 
xk  +  1  without  any  additional  information.  Other  methods,  known  as  multistep 
methods,  use  information  from  x\  xkl,  xk  2,  xk  3, ...  to  compute  yk  +  1  at  xk  +  I,  therefore 
overcoming  much  of  the  drifting  of  the  Euler  method  ([16]). 

A  special  class  of  these  multistep  methods  that  forms  the  basis  of  the  software 
now  used  for  solving  differential  equations,  is  Adams  methods,  they  are  based  on  the 
formula 

^k  + 1  + 1 

y(xk  +  1)  =  yk  +  j  k  y’(x)dx  =  |k  f(x,y(x))dx 

One  usually  models  ftx,(y(x)))  by  a  polynomial  interpolating  y'(x)  at  various  points; 
then  we  obtain  the  Adams  formulas  from 

»k+1 

yk  +  1  =yk  +  J  k  p  (xVix 

The  particular  formulas  depend  on  the  interpolation  points  chosen: 

a)  The  Adams-Bashforth  method  of  order  ( ■  degree  of  the  polynomial  used)  p  + 1 
uses  interpolation  of  y’(x)  at  xk,  xk  \  xk  2,  xk  3, ...  xk  p 


b)  The  Adams-Moulton  method  of  order  p  + 1  uses  interpolation  of  y'(x)  at  xk  +  1, 

Xk  Xk_1  Xk'^  xk"*  jjk-p  +  1 

The  first  few  instances  of  these  formulas  are  the  following: 

Adams  -  Bashforth  formulas 
yk+1  =  yk  +  hy'k  (Euler’s  method) 

yk+i  =  yk  +  (h/2)(3y'k-y'k.1> 

yk+i  =  yk  +  (h/12)(23y'k  -  16y'k  j  +  *y'k.2) 

Adams  -  Moulton  formulas 

order 

1  yk+i  =yk  +  hy'k+i 

2  yk+i  =  yk  +  (h/2)(3y'k  +  1  -  y'k) 

3  yk+i  =  yk  +  (h/12)(5y'k  +  1  -  8y'k  -  y'kl) 

5.  Getting  by  Local  Extrema 

A  lot  of  problems  scientists  deal  with,  are  formulated  as  optimization  problems. 
The  solutions  to  these  problems  can  subsequently  be  visualized  as  extrema  (maxima 
or  minima)  of  a  relevant  function.  One  usually  tries,  starting  from  the  initial 
problem,  to  come  up  with  an  equivalent  optimization  problem  that  has  only  one 
extremum  (convexity  hypothesis).  But  this  cannot  always  be  the  case.  In  fact,  most 
non-linear  problems  have  equivalent  minimization  problems  with  more  than  one 
solutions  (set  of  extrema).  A  subset  of  the  latter  consists  of  the  global  extrema;  the 
rest  of  this  set  are  points  corresponding  to  local  extrema.  Application  of  the  above 
mentioned  methods  in  such  a  case  will  definitely  give  a  solution;  however  it  will  not, 
in  general,  be  the  one  corresponding  to  the  global  extremum. 

If  we  can  find  all  of  the  solutions  of  a  problem,  then  in  many  circumstances  we 
can  determine  the  global  extremum.  The  underlying  idea  is  conceptually  simple:  one 
finds  all  of  the  solutions  for  the  initial  problem;  assuming  that  their  number  is  not 
large,  one  can  sort  them  and  find  the  one  that  corresponds  to  the  desired  extremum. 
The  key  point  is  to  find  all  of  the  solutions. 

In  order  to  this,  one  has  to  use  a  different  homotopy  than  the  ones  that  we  mentioned 
above,  called  the  all-solutions  homotopy  ([2],  [9]).  It  has  many  different  starting 
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points  in  H'HtO)  each  of  which  initiates  a  different  path.  Following  each  of  these 
paths  leads  to  each  one  of  the  different  solutions. 

The  difference  of  the  all-solutions  homotopy  from  the  previously  mentioned  ones 
is  that  now  the  Hilbert  space  is  based  on  the  complex  number  field.  In  other  words,  E 
is  now  of  the  form  Cm  where  m  is  a  (positive)  natural  number.  The  following  theorem 
provides  the  necessary  framework  that  supports  the  all -solutions  homotopy. 

Theorem:  Let  F:  Cm  -*  Cm ,  F€  C2  and  suppose  that  we  wish  to  find  all  solutions  of  the 
system: 

F(z)  =  0,  where  z€  Cm 

We  also  assume  that  F(z)  =  0  has  a  finite  number  of  solutions.  Suppose  now  that  the 
homotopy 

Hi(z,t)  =  (l-t)((zi)«i  -  1)  +  t*i(z),i  =  1, 2, 3, 4, ...  m 
is  regular  and  that  no  path  can  run  to  infinity  for  any  value  of  t  €  [to,  U].  Then  by 
starting  from  the  Q  =  many  solutions  of  H  HtO),  and  following  the 

corresponding  paths,  we  obtain  all  solutions  of  F(z)  =  0^ 

6.  Discussion 

To  relate  the  above  ideas  to  Computer  Vision  we  present  now  an  example 
showing  how  one  can  relate  the  solutions  to  the  Shape  from  Shading  problem 
through  Scale  Space.  Before  we  proceed,  let  us  first  recall  quickly  the  previous  work 
in  this  area  along  with  the  justification  for  our  choosing  this  specific  example. 


t  The  rest  of  the  idea  is  pretty  much  expected:  after  having  determined  all  the  specific  solutions,  one 
determines  which  of  the  pure  real  solutions  obtained  results  in  the  optimal  value  of  the  corresponding 
function. 

In  [2],  one  can  find  the  following  claim,  (we  quote  from  page  363): 

"...Independently  and  almost  simultaneously,  Drexler[1977, 

1978]  and  Garcia  and  Zangwill  [1979a,  1980b]  developed 
the  all-solutions  algorithm..." 

However,  the  same  algorithm  was  used  by  Wasserstrom  ([9D  in  order  to  find  the  roots  of  a  25th  degree 
polynomial,  in  1972 


In  almost  every  imaging  situation,  one  has  to  face  the  problem  of  scale;  the  image 
under  consideration  is  characterized  ([23])  by  i)  a  limited  extent  (the  outer  scale)  and 
ii)  a  limited  resolution  (the  inner  scale).  In  many  applications  the  inner  and  outer 
scales  are  set  by  the  imaged  object.  For  example  a  house  does  not  exist  at  the  scale  of 
the  bricks  making  up  its  walls,  nor  at  the  scale  of  the  city.  As  Koenderink  ([23]) 
points  out: 

"...  if  you  have  no  a  priori  reasons  to  look  for  certain 
features, then  you  cannot  decide  on  the  right  scale, 
except  for  certain  trivial  cases...” 

Hence,  if  one  aims  to  retain  all  of  the  existing  structure  and  still  be  able  to 
identify  objects  through  (de-)blurring,  one  must  consider  the  image  at  different 
levels  of  resolution  in  a  continuous  way.  Another  fact  that  pleads  for  this  approach  is 
the  ability  of  the  visual  system  to  "zoom  in”  on  the  right  range  of  scale  ([21],  [22]). 

From  the  above,  the  necessity  for  a  continuous  description  of  the  imaged  objects 
over  an  interval  of  resolutions  becomes  clear,  and  scale  space  parameterization  ([14]) 
seems  to  be  the  right  approach. 

Multigrid  algorithms  have  been  used  in  prior  work  ([24])  in  the  developing  of 
iterative  algorithms  for  a  large  number  of  Computer  Vision  problems  including 
shape  from  shading.  Although  this  methodology  is  characterized  by  two  attractive 
features  such  as  i)  faster  convergence  to  a  solution  and  ii)  the  computation  of 
mutually  consistent  visual  representations  over  a  range  of  resolutions,  it  still 
requires  an  a  priori  fixed  multigrid  structure  with  a  concrete  number  of  levels. 

The  real  challenge  ([23])  is  to  understand  and  treat  the  image  at  all  levels  of 
resolution  (  assuming  a  closed  interval  of  resolutions,  of  course  )  simultaneously  and 
not  as  an  unrelated  set  of  images  derived  at  different  scales. 

An  attempt  towards  this  direction  was  made  by  Koenderink  ([23]).  Motivated  by 
the  intuition  that  the  nature  of  the  heat  equation  ([25])  provides,  he  embedded  the 
original  image  in  a  one  parameter  family  of  "derived”  images,  generated  by  a 
diffusion  process,  and  studied  the  problem  by  considering  the  whole  family  of 
images.  The  involved  parameter  was  the  scale.  He  even  proceeded  further  by 
deriving  conditions  that  had  to  be  satisfied  if  one  wanted  to  guarantee  the  accuracy 
and  the  stability  (in  the  numerical  sense  )  of  the  representation. 

6.1  Shape  from  Shading  through  Scale  Space;  deriving  the  relation 
between  the  solutions 


Consider  the  shape  from  shading  problem  ([19],  [26],  [27]).  This  problem  can  be 
formulated  as  an  optimization  problem  as  follows: 

the  solution  to  the  problem  is  that  pair  of  functions  p,  q  that  minimize  the  value  of 
the  following  integral: 

F  =  j  ( A(E  -  R)2+  (p2  +  p2)  +  (q2  +  q2)  )dxdy  (16) 

where  p  =  p(x,y),  q  =  q(x,y)  and  (x,y)  are  the  image  plane  coordinates  for  any 
arbitrary  point.  In  the  above  formulation  of  the  problem  we  follow  the  approach 
taken  in  [27]. 

A  necessary  condition  for  this  integral  to  have  an  extremum  (in  this  case  a 
minimum)  is  given  by  the  corresponding  Euler  equations: 

(Fi(p,q),  F2(p,q))  =  0 

where, 

Ftfp.q)  =  0F/0p  -  0(0F/0px)/0x  -  0(0F/0pv)/0y  =  0(17) 

F2(p,q)  =  0F/0q  -  0(0F/0qx)/0x  -  0(0F/0qy)/0y  =  0(18) 

Taking  into  account  equation  (16),  we  can  write  equations  (17)  and  (18)  as: 

Fi(p,q)  =V2p  +  X(0R/0p)(E  -  R)  =  0(19) 

F2(p,q)  =V2q  +  X(0R/0q)(E  -  R)  =  0(20) 

Equations  (19)  and  (20)  describe  our  initial  non-linear  problem.  Considering  now  the 
following  homotopy 

H(x,o)  =  F(x,o)  =  (Fi(p,q,o),  F2(p,q,o))  = 

(  V2P  +  X(0R/0p)(E®Go-R),V2q  +  X(0R/0q)(E®Go-R) )  (21) 

( where  G0  =  ( l/2no2)e  -  (x2  +  y2)/2o2  the  Gaussian  distribution  centered  at  (0,0)  with  a 
standard  deviation  of  o,  x  =  (p,  q),  and  ®:  the  convolution  operator),  in  the  interval 
[0,  oo]  =  [oi,  oo],  we  have  from  (21),  that 

Hi(x,o)  =  V2P  +  X(0R/0p)(E®Go-R)  <22) 

H2(x,o)  =  V2q  +  X(0R/0q)(E®Go-R)  (23) 

Differentiation  leads  us  to: 

0Hi/0p  =  X(02R/0p2)(E  ®  G0  -  R)  —  X(0R/0p)2  (24) 

OH^Oq  =  X(02R/0q2)(E  ®  G0  -  R)  -  X(0R/0q)2  (25) 

0H i/0q  =  X(02R/0q0p)(E  ® G0  -  R)  -  X(0R/0q)(0R/0p)  (26) 

0H2/0P  =  X(02R/0p0q)(E  ®G0  -  R)  -  X(0R/0p)(0R/0q)  (27) 


6H  i/0o  =  A(0R/0p)(  -  2/o)(E®  G0  + 10) 
0H2/eo  =  A(0R/0q)(  -  2/oXE  ®  G0  +  Io) 


(28) 

(29) 


where 


/  = 
0 


(x-x)2  +  ( y  —  y  )2 
2o2 


e 


’2  ’2 
(x-x  )  +  (  y-y  ) 


2o 


E  (x’,  /)  dx’dy’ 


At  this  point  it  is  worth  mentioning  that  under  the  assumption  that  the  partial 
derivatives  0R/0p  and  0R/0q  exist  and  are  continuous,  the  order  of  differentiation  in 
02R/0p0q  does  not  matter,  i.e.  02R/0p0q  =  02R/0q0p.  From  the  above  ,  we  can  see  that 
the  Jacobian  is  equal  to: 


A(02R/9p2)(E®Go  -  R)  -  A(9R/9p)-  A(02R/0q9p)(E®Go  -  R)  -  A(0R/0q) 

(0R/0p) 


A(02R/0p9q)(E®Go  -  R)  -  A(0R/9p)  A(92Ry0q2)(E®Go  -  R)  -  A(0R/9q)2 

(0R/0q) 


A(9R/0p)(  -  2/o) 
(E®G0+  I0) 

A(0R/0qH  -  2/o) 
(E®G0  + 10) 


Parameterizing  with  respect  to  the  length  of  the  arc,  and  using  the  Basic  Differential 
Equation  approach,  as  involving  mild  requirements  (see  also  above  discussion)  the 
following  system  of  equations  results: 


dp/ds  = 


—  det 


A(0JR/0q0p)(E®Go  -  R)  -  A(0R/0q)  A(0R/0p)(  -  2/o) 

(0R/0p)  (E®G0  +  I0) 


A(02R/0qJ)(E®Go  -  R)  -  A(0R/0q)2  A(0R/0q)(  -  2/o) 

(E®G0+I0) 


(30) 


(31) 


dq/ds  = 


+  det 


A(02R/0p2)(E  ®G0  -  R)  -  A(0R/0p)2  A(0R;0p)(  -  2/o) 

(E®G0  +  I0) 


A(02R/0p0q)(E®  G0  -  R)  -  A(0R/0p)  A(0R/0q)<  -  2/o) 

(0R/0q)  (E®G0+I0) 


A(02R70p2)(E®  G0  -  R)  -  A(0R/0p)2 


do/ds  = 


—  det 


A(O2R/0p0q)(E®Go-  R)  -  A(0R/0p) 
(0R/0q) 


A(02R/0q0p)(E®Go  -  R)  -  A(0R/0q) 
(0R/0p) 

A(02R/0q2)(E®Go-  R)  -  A(0R/0q)2 


(32) 


The  above  derived  system  of  equations,  describes  the  relation  of  the  solutions  to 
the  shape  from  shading  problems  through  scale  space. 

Recalling  equation  (21),  we  observe  that  the  problem  to  be  solved  at  scale  oo  is: 
V2P  +  \(0R/0p)(E®GOo-R)  =0  (33a) 

V2q  +  X(0R/0q)(E®GOo-R)  =  0  (33b) 

i.e  the  shape  from  shading  problem  for  an  image  derived  from  the  original  by 
blurring  with  a  Gaussian  of  standard  deviation  oo.  Furthermore  at  scale  oj  =0,  the 
problem  to  be  solved  is  the  one  we  originally  had.  Denote  by  xo  and  xi  respectively, 
the  solutions  to  these  two  problems.  Equations  (30)  -  (32)  "rule”  the  path  linking  the 
solutions  xo  and  xj.  In  particular,  if  one  can  find  a  solution  to  equations  (33),  one  can 
then  track  the  solution  from  coarser  to  finer  scales  until  the  solution  xi  is  reached. 
The  problem  that  exists  though,  is  that  one  cannot  guarantee  that  such  an  initial 
solution  xo  can  always  be  found,  since  one  has  still  to  worry  about  issues  of 
convergence  of  the  method  in  use. 

7.  Conclusion  and  future  work 

In  the  above  we  presented  a  general  framework  for  the  solving  of  non-linear 
equations;  its  power  has  been  demonstrated  by  an  example  that  showed  how  one  can 
relate  the  solutions  to  the  Shape  from  Shading  problem  through  Scale  Space. 


We  also  saw  that  since  there  is  no  guarantee  that  the  initial  solution  solution  xo  can 
be  always  found,  relating  the  solutions  to  the  Shape  from  Shading  problem  through 
Scale  Space  seems  to  be  the  best  one  can  hope  for. 

However,  it  may  be  possible  to:  i)  either  derive  a  homotopy  (imbedding)  that 
renders  the  initial  problem  at  oo  always  trivially  solvable,  or  ii)  find  a  particular 
value  for  oo  for  which  the  initial  problem  is  always  easily  solvable  or  iii)  use  the 
additional  constraint  that  the  above  system  of  equations  provide  to  derive  a  more 
robust  algorithm  for  solving  the  problem  at  hand. 

I  am  currently  being  investigating  these  last  three  issues;  any  conclusion  at  this 
point  would  definitely  be  premature. 

It  should  be  clear  by  now,  that  the  above  approach  is  not  restricted  to  the  shape 
from  shading  problem  only,  but  can  similarly  and  easily  extended  to  all  of  the 
Computer  Vision  problems  that  are  described  by  a  set  of  non-linear  equations  and 
are  subject  to  scale  space  formulations. 
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